# Working Qwen2.5 Sample with Foundry Local

## Important Note about Package Availability

The packages I initially used in the sample don’t match the packages that are actually available. Based on Microsoft Learn documentation, here are the correct packages and approach.
## Fixed Basic Sample (Working)

The basic console sample has been updated with the correct packages:

- ✅ `Microsoft.AI.Foundry.Local` (version 0.1.0)
- ✅ `OpenAI` (version 2.2.0-beta.4)
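In the project file, these map to `PackageReference` entries like the following (a sketch — the target framework and other properties of your `.csproj` may differ):

```xml
<ItemGroup>
  <PackageReference Include="Microsoft.AI.Foundry.Local" Version="0.1.0" />
  <PackageReference Include="OpenAI" Version="2.2.0-beta.4" />
</ItemGroup>
```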
## How to Run the Basic Sample

1. Navigate to the basic sample:

   ```shell
   cd "c:\dev\Samples\Qwen25-FoundryLocal-Sample"
   ```

2. Install packages:

   ```shell
   dotnet restore
   ```

3. Run the application:

   ```shell
   dotnet run
   ```

   Or run the simplified version:

   ```shell
   dotnet run SimpleProgram.cs
   ```
## About the Aspire Integration

The .NET Aspire integration shown in your session transcript uses packages that are likely in private preview or internal Microsoft builds:

- `Microsoft.Extensions.Hosting.FoundryLocal` - not yet publicly available
- `Aspire.Azure.AI.Inference` - not yet publicly available
## Expected Aspire Integration (When Available)

Based on your session transcript, the Aspire integration would work like this:

**AppHost (when packages are available):**

```csharp
var foundryResource = builder.AddFoundryLocalResource("ai")
    .AddModel("chat", "Qwen2.5-0.5B");

builder.AddProject<Projects.WebApp>()
    .WithReference(foundryResource)
    .WaitFor(foundryResource);
```

**Client App (when packages are available):**

```csharp
builder.Services.AddChatCompletionsClient("chat")
    .AsOpenAIClient()
    .UseFunctionCalling()
    .UseOpenTelemetry();
```

## Current Working Approach
Until the Aspire packages are publicly available, you can:
- Use the basic console sample - This works with the current publicly available packages
- Create a manual Aspire setup - Start Foundry Local manually and connect your web app to it
- Use the OpenAI SDK directly - Connect to Foundry Local’s OpenAI-compatible endpoint
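A minimal sketch of the third option, assuming Foundry Local is already running on its default port (5272, as shown in the sample output below — the port can vary, so check the service URI printed at startup) and the model alias has been loaded. The `OpenAI` 2.x SDK lets you point its client at any OpenAI-compatible endpoint; a console app with implicit usings is assumed:

```csharp
using System.ClientModel;
using OpenAI;
using OpenAI.Chat;

// Foundry Local exposes an OpenAI-compatible endpoint locally;
// no real API key is required, but the SDK expects a credential.
var client = new OpenAIClient(
    new ApiKeyCredential("not-needed-for-local"),
    new OpenAIClientOptions { Endpoint = new Uri("http://localhost:5272/v1") });

// Use the model alias as the model name.
ChatClient chat = client.GetChatClient("qwen2.5-0.5b-instruct");

ChatCompletion completion = chat.CompleteChat("Say hello in one sentence.");
Console.WriteLine(completion.Content[0].Text);
```

This is the same pattern the basic sample uses internally; the difference is that here you manage the service lifecycle yourself instead of going through `Microsoft.AI.Foundry.Local`.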
## Model Aliases

Use these model aliases for optimal hardware selection:

- `qwen2.5-0.5b-instruct` - smallest, fastest
- `qwen2.5-1.5b-instruct` - balanced performance
- `qwen2.5-3b-instruct` - highest quality
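If you want to try an alias outside the sample, the Foundry Local CLI can run it directly (a sketch — these commands reflect the `foundry` CLI as currently documented and may change):

```shell
# List models available in the Foundry Local catalog
foundry model list

# Download (if needed) and start an interactive session with an alias;
# Foundry Local picks the hardware-appropriate variant for the alias
foundry model run qwen2.5-0.5b-instruct
```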
## Sample Output

When you run the basic sample, you should see output like this:

```text
Starting Foundry Local service...
Foundry Local service started!
Service URI: http://localhost:5272
API Endpoint: http://localhost:5272/v1
Loading model: qwen2.5-0.5b-instruct
Model loaded successfully!

=== Chat Completion Example ===
AI Response: Running AI models locally offers several key benefits: cost savings since there are no cloud service fees, enhanced privacy as your data never leaves your device, offline capability without internet dependency, and complete control over processing speed based on your hardware capabilities.

=== Streaming Chat Example ===
Question: Explain local AI in 2 sentences.
AI Response (streaming): Local AI refers to running artificial intelligence models directly on your own device rather than sending data to cloud servers for processing. This approach provides better privacy, eliminates ongoing costs, and allows AI functionality to work offline while giving you complete control over your data.
Streaming completed!

Unloading model...
Stopping Foundry Local service...
Done!
```
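While the service is running, you can also exercise the OpenAI-compatible endpoint directly from the command line (assuming the default port 5272 shown in the sample output; substitute the port your service actually reports):

```shell
# Standard OpenAI-style chat completion request against the local endpoint
curl http://localhost:5272/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "qwen2.5-0.5b-instruct",
        "messages": [{"role": "user", "content": "Hello!"}]
      }'
```

This is a quick way to confirm the service and model are reachable before wiring up a client app.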
## Prerequisites
- Foundry Local installed - Follow the installation guide
- .NET 8.0 SDK or later
- Sufficient RAM - At least 4GB available for Qwen2.5-0.5B
- Good internet connection - For initial model download (~800MB)
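A quick way to confirm the tooling prerequisites from the command line (assuming the `foundry` CLI was added to your PATH by the installer):

```shell
# Verify the .NET SDK version (should be 8.0 or later)
dotnet --version

# Verify Foundry Local is installed and its service responds
foundry service status
```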
## Troubleshooting

**If you get package errors:**

- Make sure you’re using the exact package names and versions shown above
- Clear the NuGet cache: `dotnet nuget locals all --clear`
- Try deleting the `bin` and `obj` folders and running `dotnet restore` again
**If Foundry Local doesn’t start:**
- Make sure Foundry Local is properly installed on your system
- Check that no other instances are running
- Verify you have sufficient system resources
## Next Steps
- Try the basic sample first to ensure everything works
- Experiment with different model aliases
- Monitor for availability of official Aspire integration packages
- Consider building your own simple orchestration layer for web apps
The session transcript shows the future vision of seamless Aspire integration, but the current publicly available packages provide the foundation for local AI development.